METHOD AND DEVICE FOR BUILDING A VARIABLE-LENGTH ERROR-CORRECTING CODE

FIELD OF THE INVENTION 

The present invention relates to a method of building a variable length error
code, said method comprising the steps of :

(1) initializing the needed parameters : minimum and maximum length of
codewords L1 and Lmax respectively, free distance dfree between each codeword (said
distance dfree being for a VLEC code C the minimum Hamming distance in the set of
all arbitrary extended codes), required number of codewords S ;

(2) generating a fixed length code C of length L1 and minimal distance bmin,
with bmin = min {bk ; k = 1, 2, ..., R}, bk = the distance associated to the codeword
length Lk of code C and defined as the minimum Hamming distance between all
codewords of C with length Lk, and R = the number of different codeword lengths in
C, said generating step creating a set W of n-bit long words distant of d ;

(3) listing and storing in the set W all the possible L1-tuples at the distance
of dmin from the codewords of C (said distance dmin for a VLEC code C being the
minimum value of all the diverging distances between all possible couples of
different-length codewords of C), and, if said set W is not empty, doubling the
number of words in W by affixing at the end of all words one extra bit, said storing
step therefore replacing the set W by a new one having twice as many words as the
previous one and the length of each one of these words being L1 + 1 ;

(4) deleting all the words of the set W that do not satisfy the cmin distance with
all codewords of C, said distance cmin being the minimum converging distance of the
code C ;

(5) in the case where no word is found or the maximum number of bits is
reached, reducing the constraint of distance for finding more words ;

(6) checking that all words of the set W are distant of bmin, the found words
being then added to the code C ;

(7) if the required number of codewords has not been reached, repeating the
steps (1) to (6) until the method finds either no further possibility to continue or the
required number of codewords ;

(8) if the number of codewords of C is greater than S, calculating on the basis
of the structure of the VLEC code, the average length AL obtained by weighting
each codeword length with the probability of the source, said AL becoming the
ALmin, if it is lower than ALmin, with ALmin = the minimum value of AL, and the
corresponding code structure being kept in memory.

WO 2004/084417 PCT/IB2004/000834

The invention also relates to a corresponding device. 

BACKGROUND OF THE INVENTION

A classical communication chain, illustrated in Fig.1, comprises, for coding
the signals coming from a source S, a source coder 1 (SCOD) followed by a channel 
coder 2 (CCOD) and, after the transmission of the coded signals thus obtained 
through a channel 3, a channel decoder 4 (CDEC) and a source decoder 5 (SDEC). 
The decoded signals are intended to be sent towards a receiver. Variable-length 
codes (VLC) are classically used in source coding for their compression capabilities, 
and the associated channel coding techniques combat the effects of the real 
transmission channel (such as fading, noise, etc.). However, since source coding is 
intended to remove redundancy and channel coding to re-introduce it, it has been
investigated how to efficiently coordinate these techniques in order to improve the 
overall system while keeping the complexity at an acceptable level. 

Among the solutions proposed in such an approach, the variable-length error
correcting (VLEC) codes have the advantage of being variable-length while
providing error correction capabilities, but building these codes is rather time
consuming for short alphabets (and becomes even prohibitive for sources with larger
alphabets), and the construction complexity is also a drawback, as will be
seen.

First, some definitions and properties of the classical VLC must be recalled. A
code C is a set of S codewords $\{c_1, c_2, c_3, \ldots, c_i, \ldots, c_S\}$, for each of
which a length $\ell_i = |c_i|$ is defined, with
$\ell_1 \le \ell_2 \le \ell_3 \le \ldots \le \ell_i \le \ldots \le \ell_S$ without any loss of generality.
The number of different codeword lengths in the code C is called R, with obviously
$R \le S$, and these lengths are denoted as $L_1, L_2, L_3, \ldots, L_i, \ldots, L_R$, with
$L_1 < L_2 < L_3 < \ldots < L_R$. A variable-length code, or VLC, is then the structure
denoted by $(s_1@L_1, s_2@L_2, s_3@L_3, \ldots, s_R@L_R)$, which corresponds to $s_1$
codewords of length $L_1$, $s_2$ codewords of length $L_2$, $s_3$ codewords of length
$L_3$, ..., and $s_R$ codewords of length $L_R$. When using a VLC, the compression
efficiency, for a given source, is related to the number of bits necessary to transmit
symbols from said source. The measure used to estimate this efficiency is often the
average length AL
source, is related to the number of bits necessary to transmit symbols from said 
source. The measure used to estimate this efficiency is often the average length AL 
of the code, i.e. the average number of bits needed to transmit a word, and said
average length is given, when each symbol $a_i$ is mapped to the codeword $c_i$, by the
following relation (1) :

$$AL = \sum_{i=1}^{S} \ell_i \, P(a_i) \qquad (1)$$

which is equivalent to the relation (2), obtained by grouping the codewords of equal
length :

$$AL = \sum_{i=1}^{R} L_i \left( \sum_{j = 1 + \sum_{k=1}^{i-1} s_k}^{\sum_{k=1}^{i} s_k} P(a_j) \right) \qquad (2)$$

where, for a data source A, the S source symbols are denoted by
$\{a_1, a_2, a_3, \ldots, a_S\}$ and $P(a_i)$ is the respective probability of occurrence of
each of these symbols, with $\sum_{i=1}^{S} P(a_i) = 1$. If ALmin denotes the minimal
value for the average length AL, it is easy to see that when ALmin is reached, the
symbols are indexed in such a way that
$P(a_1) \ge P(a_2) \ge P(a_3) \ge \ldots \ge P(a_i) \ge \ldots \ge P(a_S)$. In order to encode the
data in such a way that the receiver can decode the coded information, the VLC
must satisfy the following properties : to be non-singular (all the codewords are
distinct, i.e. no more than one source symbol is allocated to one codeword) and to be
uniquely decodable (i.e. it is possible to map any string of codewords
unambiguously back to the correct source symbols, without any error).
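
The equivalence of relations (1) and (2) can be checked numerically. The following is a minimal sketch, assuming codewords are represented as bit strings; the four-word code and the probabilities are hypothetical values chosen for illustration and do not come from the figures.

```python
from collections import Counter

def average_length(codewords, probs):
    """Relation (1): sum over symbols of (codeword length) * (symbol probability)."""
    return sum(len(c) * p for c, p in zip(codewords, probs))

def average_length_grouped(codewords, probs):
    """Relation (2): first add up the probabilities of all symbols whose
    codewords share a length, then weight each length by that total."""
    mass_per_length = Counter()
    for c, p in zip(codewords, probs):
        mass_per_length[len(c)] += p
    return sum(length * mass for length, mass in mass_per_length.items())

# Hypothetical code and source (illustration only), most probable symbol first.
codewords = ["00", "11", "0110", "1001"]
probs = [0.5, 0.25, 0.125, 0.125]

print(average_length(codewords, probs))          # 2.5
print(average_length_grouped(codewords, probs))  # 2.5
```

Both functions return the same value because relation (2) only regroups the terms of relation (1) by codeword length.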

An introduction and a presentation of different distances that are useful when
reviewing some general properties of the VLC codes will then help to recall the
notion of error-correcting property used in the VLEC code theory :

(a) Hamming weight and distance : if $\mathbf{w}$ is a word of length n with
$\mathbf{w} = (w_1, w_2, \ldots, w_n)$, the Hamming weight of $\mathbf{w}$, or simply weight, is
the number $W(\mathbf{w})$ of non-zero symbols in $\mathbf{w}$ :

$$W(\mathbf{w}) = \left| \{\, i : 1 \le i \le n,\ w_i \neq 0 \,\} \right| \qquad (3)$$

and, if $\mathbf{w}_1$ and $\mathbf{w}_2$ are two words of equal length n with
$\mathbf{w}_i = (w_{i1}, w_{i2}, w_{i3}, \ldots, w_{in})$ and i = 1 or 2, the Hamming distance (or,
simply, distance) between $\mathbf{w}_1$ and $\mathbf{w}_2$ is the number of positions in which
$\mathbf{w}_1$ and $\mathbf{w}_2$ differ (for example, for the binary case, it is easy to see that :

$$H(\mathbf{w}_1, \mathbf{w}_2) = W(\mathbf{w}_1 + \mathbf{w}_2) \qquad (4)$$

where the addition is modulo-2). However, the Hamming distance is by definition
restricted to fixed-length codes, and other distances will be defined before
considering VLEC codes.
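
As an illustration of relation (4) in the binary case, the sketch below computes the weight and the distance directly and compares the distance with the weight of the modulo-2 sum. Words are assumed to be strings of "0" and "1"; the function names are illustrative only.

```python
def weight(w):
    """Hamming weight: number of non-zero symbols in the word."""
    return sum(1 for symbol in w if symbol != "0")

def hamming(w1, w2):
    """Hamming distance: number of positions in which two equal-length words differ."""
    if len(w1) != len(w2):
        raise ValueError("Hamming distance requires words of equal length")
    return sum(a != b for a, b in zip(w1, w2))

def add_mod2(w1, w2):
    """Bitwise modulo-2 sum of two equal-length binary words."""
    return "".join("1" if a != b else "0" for a, b in zip(w1, w2))

w1, w2 = "1011", "0010"
print(hamming(w1, w2))           # 2
print(weight(add_mod2(w1, w2)))  # 2
```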

(b) extended code : let $f_i = w_{i_1} w_{i_2} \ldots w_{i_n}$ be a concatenation of n
words of a VLEC code C, then the set $F_N = \{ f_i : |f_i| = N \}$ is called the extended
code of C of order N.

(c) minimum block distance and overall minimum block distance : the
minimum block distance $b_k$ associated to the codeword length $L_k$ of a VLEC code C
is defined as the minimum Hamming distance between all distinct codewords of C
with the same length $L_k$ :

$$b_k = \min \{ H(c_i, c_j) : c_i, c_j \in C,\ i \neq j \text{ and } |c_i| = |c_j| = L_k \} \quad \text{for } k = 1, \ldots, R \qquad (5)$$

and the overall minimum block distance $b_{min}$ of said VLEC code C, which is the
minimum block distance value for every possible length $L_k$, is defined by :

$$b_{min} = \min \{ b_k : k = 1, \ldots, R \} \qquad (6)$$

(d) diverging distance and minimum diverging distance : the diverging
distance between two codewords of different length
$c_i = x_{i_1} x_{i_2} \ldots x_{i_{\ell_i}}$ and $c_j = x_{j_1} x_{j_2} \ldots x_{j_{\ell_j}}$ of a VLEC code C,
where $c_i, c_j \in C$, $\ell_i = |c_i|$ and $\ell_j = |c_j|$ with $\ell_i > \ell_j$, is defined by :

$$D(c_i, c_j) = H(x_{i_1} x_{i_2} \ldots x_{i_{\ell_j}},\ x_{j_1} x_{j_2} \ldots x_{j_{\ell_j}}) \qquad (7)$$

i.e. it is also the Hamming distance between an $\ell_j$-length codeword and the
$\ell_j$-length prefix of a longer codeword, and the minimum diverging distance
$d_{min}$ of said VLEC code C is the minimum value of all the diverging distances
between all possible couples of codewords of C of unequal length :

$$d_{min} = \min \{ D(c_i, c_j) : c_i, c_j \in C,\ |c_i| > |c_j| \} \qquad (8)$$

(e) converging distance and minimum converging distance : the converging
distance between two codewords of different length
$c_i = x_{i_1} x_{i_2} \ldots x_{i_{\ell_i}}$ and $c_j = x_{j_1} x_{j_2} \ldots x_{j_{\ell_j}}$ of a VLEC code C,
where $|c_i| = \ell_i > |c_j| = \ell_j$, is defined by :

$$C(c_i, c_j) = H(x_{i_{\ell_i - \ell_j + 1}} x_{i_{\ell_i - \ell_j + 2}} \ldots x_{i_{\ell_i}},\ x_{j_1} x_{j_2} \ldots x_{j_{\ell_j}}) \qquad (9)$$

i.e. it is also the Hamming distance between an $\ell_j$-length codeword and the
$\ell_j$-length suffix of a longer codeword, and the minimum converging distance
$c_{min}$ of said VLEC code C is the minimum value of all the converging distances
between all possible couples of codewords of C of unequal length :

$$c_{min} = \min \{ C(c_i, c_j) : c_i, c_j \in C,\ |c_i| > |c_j| \} \qquad (10)$$

(f) free distance : the free distance of a code is the minimum Hamming
distance in the set of all arbitrary long paths that diverge from some common state
$S_i$ and converge again in another common state $S_j$, with j > i :

$$d_{free} = \min \{ H(f_i, f_j) : f_i, f_j \in F_N,\ f_i \neq f_j,\ N = 1, 2, \ldots, \infty \} \qquad (11)$$

Following the structure model used for a VLC, it is therefore possible to
describe the structure of the VLEC code C by the notation :

$$s_1@L_1, b_1 \ ;\ s_2@L_2, b_2 \ ;\ \ldots \ ;\ s_R@L_R, b_R \ ;\ d_{min}, c_{min} \qquad (12)$$

where there are $s_i$ codewords of length $L_i$ with minimum block distance $b_i$, for all
i = 1, 2, ..., R (it is recalled that R is the number of different codeword lengths), and
minimum diverging and converging distances $d_{min}$ and $c_{min}$. The most important
parameter of a VLEC code is its free distance $d_{free}$, which greatly influences its
performance in terms of error-correcting capabilities, and it can be shown that the
free distance of a VLEC code is bounded by :

$$d_{free} \ge \min (b_{min},\ d_{min} + c_{min}) \qquad (13)$$
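
The distances of relations (5) to (10) and the bound (13) can be evaluated by a direct pairwise scan of the code. The following is a minimal sketch under the assumption that codewords are non-empty bit strings; the example code is hypothetical and the function names are not taken from the cited works.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def vlec_distances(code):
    """Return (b_min, d_min, c_min) for a list of bit-string codewords.
    A distance stays at infinity when no pair of codewords defines it."""
    b_min = d_min = c_min = float("inf")
    for u, v in combinations(code, 2):
        short, long_ = sorted((u, v), key=len)
        n = len(short)
        if len(long_) == n:
            # Equal lengths: block distance, relations (5) and (6).
            b_min = min(b_min, hamming(short, long_))
        else:
            # Unequal lengths: compare with the prefix (diverging distance)
            # and with the suffix (converging distance) of the longer word.
            d_min = min(d_min, hamming(short, long_[:n]))
            c_min = min(c_min, hamming(short, long_[-n:]))
    return b_min, d_min, c_min

def free_distance_lower_bound(code):
    """Right-hand side of relation (13)."""
    b_min, d_min, c_min = vlec_distances(code)
    return min(b_min, d_min + c_min)

code = ["00", "11", "0101", "1010"]     # hypothetical example
print(vlec_distances(code))             # (2, 1, 1)
print(free_distance_lower_bound(code))  # 2
```

The scan only yields the lower bound of relation (13); computing the free distance itself would require exploring the extended codes of relation (11).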

These definitions being recalled, the state-of-the-art in VLEC codes
construction can now be described more easily. The first types of VLEC codes,
called α-prompt codes and introduced in 1974, and an extension of this family,
called $\alpha_{t_1, t_2, \ldots, t_R}$-prompt codes, both have the same essential property : if one
denotes by α(ci) the set of words that are closer to ci than to any codeword cj, with j
≠ i, no sequence in α(ci) is a prefix of a sequence in another α(cj). The construction
of these codes is very simple, and the construction algorithm is adjustable by the
number of codewords at each length, which makes it possible to find the best prompt
code for a given source and a given dfree. However, this best code performs poorly in
terms of compression performance.

A more recent construction, allowing the construction of a VLEC code from
the generator matrix of a fixed-length linear block code, was proposed in the
document "Variable-length error-correcting codes" by V.Buttigieg, Ph.D.Thesis,
University of Manchester, England, 1995. Called code-anticode construction, this
algorithm relies on line combinations and column permutations to form an anticode
at the rightmost column. Once the code-anticode generator matrix is obtained, the
VLEC code is simply obtained by a matrix multiplication.

This technique has however several drawbacks. First, there is no explicit
method to find the needed line combinations and column permutations to obtain the
anticode. Moreover, the construction does not take into account the source statistics
and, consequently, often proves sub-optimal (one can find a code with smaller
average length by a post-processing on the VLEC code). In the same document, the
author has then proposed an improved method, called Heuristic method, that is
based on a computer search for building a VLEC code giving the best known
compression rate for a specified source and a given protection against errors, i.e. a
code C with specified overall minimum block, diverging and converging distances
(and hence a minimum value for dfree) and with codeword lengths matched to the
source statistics so as to obtain a minimum average codeword length for the chosen
free distance and the specified source (in practice, one takes : bmin = dmin + cmin =
dfree, and : $d_{min} = \lceil d_{free}/2 \rceil$).

The main steps of this Heuristic method, which uses the following
parameters : minimum length L1 of codewords, maximum length Lmax of codewords,
free distance dfree between each codeword, number S of codewords required, are
now described with reference to the flowcharts of Figs.2 to 4.

To start the computer search ("Start"), all the needed parameters must be first
specified : L1 (the minimum codeword length, which must be at least equal to
the minimum diverging distance required), Lmax (the maximum
codeword length), the different distances between codewords (dfree, bmin, dmin, cmin),
and S (the number of codewords required by the given source), and some relations
are set when choosing these parameters :

$$L_1 \ge d_{min}$$
$$b_{min} \ge d_{free}$$
$$d_{min} + c_{min} = d_{free}$$

The first phase of the algorithm, referenced 11, is then performed : it consists
in the generation of a fixed length code (put initially in C) of length L1 and minimal
distance bmin, with a maximum number of codewords. This phase is in fact an
initialization, performed for instance by means of an algorithm such as the greedy
algorithm (GA), presented in Fig.5, or the majority voting algorithm (MVA),
presented in Fig.7, or a new proposed variation, denoted by GAS (Greedy Algorithm
by Step), which is a variation of the two above-mentioned ones. The GAS
uses the search method of the GA, except that, instead of deleting half of the
codewords, only the last codeword of the group is deleted. The GA and the MVA are
useful to create a set W of n-bit long words distant of d (in practice, it may be noted
that the MVA finds more words than the GA, but it requires too much time for only a
small improvement of the compression capacity, as shown in the tables of Figs.6 and
8, which compare, respectively for the GA and for the MVA, the best code
structures obtained with different values of dfree for the 26-symbol English source
defined in the table of Fig.9).
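
The flowcharts of the GA, the MVA and the GAS are given in the figures and are not reproduced in this text. As a stand-in, the sketch below uses a plain lexicographic greedy selection, which is an assumption made here for illustration and not the GA of Fig.5 : it scans the candidate words in order and keeps each word that lies at a Hamming distance of at least d from all words already kept.

```python
from itertools import product

def hamming(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def all_words(n):
    """All n-bit words, in lexicographic order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def greedy_select(candidates, d):
    """Keep each candidate that is at distance >= d from every word kept so far.
    Illustrative lexicographic greedy, not the GA, MVA or GAS of the figures."""
    chosen = []
    for w in candidates:
        if all(hamming(w, c) >= d for c in chosen):
            chosen.append(w)
    return chosen

print(greedy_select(all_words(3), 2))  # ['000', '011', '101', '110']
```

The same routine can be applied to a set W of candidates instead of all n-tuples, which is the situation of the step 33 described further on.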

The second phase of the algorithm, corresponding to the elements referenced
21 to 24 (21+22 = operation "A0" ; 23+24 = operation "A2") in Fig.2, consists in
listing and storing (step 21) in a set called W all the possible L1-tuples at the
distance of dmin from the codewords in C. If dmin ≥ bmin, then W is empty. If this set
W of all the words satisfying the minimum diverging distance to the current code is
not empty (reply NO to the test 22 : |W| = 0 ?), the number of words in W is doubled
by increasing the length of the words by one bit, affixing first a "0" and then a "1"
to the rightmost position of all the words in W (step 24), except if the maximum
number of bits is exceeded (reply YES to the test 23). At the output of said step 24,
this modified set W has twice as many words as the previous W, and the length of
each one is L1 + 1.
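
The steps 21 and 24 can be sketched as follows. This is a minimal illustration which assumes that "at the distance of dmin" means a Hamming distance of at least dmin between each codeword and the prefix of the candidate having the same length ; the function names are hypothetical.

```python
from itertools import product

def hamming(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def list_candidates(code, length, d_min):
    """Step 21 (sketch): all `length`-tuples whose prefix is at distance
    >= d_min from every codeword already in the code."""
    words = ("".join(bits) for bits in product("01", repeat=length))
    return [w for w in words
            if all(hamming(w[:len(c)], c) >= d_min for c in code)]

def extend_by_one_bit(words):
    """Step 24 (sketch): affix a "0" and then a "1" to every word, doubling the set."""
    return [w + bit for w in words for bit in "01"]

code = ["000", "011", "101", "110"]   # hypothetical fixed-length code, b_min = 2
W = list_candidates(code, 3, 1)
print(W)                              # ['001', '010', '100', '111']
print(list_candidates(code, 3, 2))    # [] : empty, since d_min >= b_min here
print(len(extend_by_one_bit(W)))      # 8
```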

The third phase of the algorithm, corresponding to the elements 31 to 35 ( =
operation "A3" in Fig.2), consists in deleting (step 31) all the words of set W that do
not satisfy the cmin distance (minimum converging distance) with all the codewords
of C (i.e. in keeping and storing in a new W only the words which satisfy said
minimum converging distance, the other ones being discarded). At this point, the
new set W is a set of words which, when compared to the codewords of C, satisfy
the required minimum diverging and converging distances (both dmin and cmin
distances) with the codewords of C. If that new set W is not empty (reply NO to the
test 32 : |W| = 0 ?), one selects in W (step 33) the maximum number of words
satisfying the minimum block distance, in order to ensure that all the words of the set
W, being of the same length, have a minimum distance at least equal to bmin. At the
end of this step 33, realized with the GA or the MVA (note that in this case, the
initial set used for the GA or the MVA is the current W and not a set of n-tuples), the
words thus obtained are added (step 34) to the codewords already in C.
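
The deletion of step 31 can be sketched as a filter on the suffixes of the words of W. The sketch assumes that a word satisfies the cmin distance when its suffix is at a Hamming distance of at least cmin from every codeword of the same length as that suffix ; the input sets are hypothetical.

```python
def hamming(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def filter_converging(words, code, c_min):
    """Step 31 (sketch): keep only the words whose suffix is at distance
    >= c_min from every (shorter, non-empty) codeword of the code."""
    return [w for w in words
            if all(hamming(w[-len(c):], c) >= c_min for c in code)]

code = ["000", "011", "101", "110"]   # hypothetical 3-bit code
W = ["0010", "0011", "0100", "0101", "1000", "1001", "1110", "1111"]
print(filter_converging(W, code, 1))  # ['0010', '0100', '1001', '1111']
```

The words kept by this filter satisfy both the diverging and the converging constraints ; the selection of step 33 then enforces bmin among them.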

If no word is found (i.e. W is empty) at the end of the step 21 (reply YES to
the test 22 : |W| = 0 ?) or if the maximum number of bits is reached or exceeded
(reply YES to the test 23), one enters the fourth phase of the algorithm (steps 41 to
46, illustrated in Fig.3 and also designated by the operation "A1" in said figure),
which is used in order to unjam the process by inserting more liberty of choice, more
particularly by affixing to all words in W extra bits (several bits at the same time)
such that the new group contains more bits than the old one. If there are enough
codewords in the last group (successive tests 41 and 42, for verifying the number of
codewords in the last group, and if there are previous groups), some of them are
deleted from said group (as described above), such deletions making it possible to
reduce the distance constraint and to find more codewords than before. As a matter of
fact, the classical Heuristic method thus described begins with the maximum number of
codewords of short length, maps them to the high-probability symbols and tries to
obtain a good compression rate, but sometimes the sizes of the small-length sets are
incompatible with the required number of codewords S. In this respect, erasing a few
codewords provides more degrees of freedom and makes it possible to reach a position
where the initial requirements on distance and number of symbols for the code can be met.
This deletion process is repeated until at most one codeword remains for
each length. If W is empty at the end of the step 31 (reply YES to the test 32 : |W| =
0 ?), the steps 23, 24, 31, 32 are repeated. If the required number of codewords has
not been reached (reply NO to the test 35 provided at the end of this third phase), the
steps 21 to 24 and 31 to 35 must be repeated until said steps find that either there are
no further possible words to be found or the required number of codewords is
reached.

If said required number of codewords has been reached (i.e. the number of
codewords of C is equal to or greater than S : reply YES to the test 35), the structure
of the VLEC code thus obtained is used in a fifth part, including the steps 51 to 56
(illustrated in Fig.4, and also designated by the operation "A4" in said figure), in
order to calculate the average length AL. This is done by weighting each codeword
length with the probability of the source, and comparing it to the current best one. If
said average length AL of this VLEC code is lower than the minimized value of AL
(= ALmin), this AL becomes the ALmin, and this new AL value and the corresponding
code structure are kept in the memory (step 51). These steps 51 and following (fifth
part ; operation "A4") make it possible to come back, within the algorithm, towards
previous groups, while the other phases of said algorithm are always performed on the
current group. The stepsize for such a feedback operation is one, i.e. this feedback
action can be considered as exhaustive.
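
The bookkeeping of step 51 can be sketched as follows. The representation of a code structure as a list of (number of codewords, length) pairs, the function names and the numerical values are assumptions made for illustration ; the sketch maps the most probable symbols to the shortest codewords, as described above.

```python
def structure_average_length(structure, probs):
    """AL of a structure given as (s_k, L_k) pairs: the most probable symbols
    receive the shortest codewords; surplus codewords are left unused."""
    lengths = [L for s, L in sorted(structure, key=lambda pair: pair[1])
               for _ in range(s)]
    ranked = sorted(probs, reverse=True)
    if len(lengths) < len(ranked):
        raise ValueError("the structure has fewer codewords than source symbols")
    return sum(L * p for L, p in zip(lengths, ranked))

def keep_best(best, structure, probs):
    """Step 51 (sketch): keep (AL_min, structure) when the new AL is lower."""
    al = structure_average_length(structure, probs)
    if best is None or al < best[0]:
        return (al, structure)
    return best

probs = [0.5, 0.25, 0.125, 0.125]      # hypothetical 4-symbol source
best = keep_best(None, [(2, 2), (2, 4)], probs)
best = keep_best(best, [(2, 2), (2, 3)], probs)
print(best)                            # (2.25, [(2, 2), (2, 3)])
```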

To continue this search of the best VLEC code, it is necessary to avoid
keeping the same structure, which would lead to a loop in the algorithm. The last
added group of the current code is deleted (steps 52, 53), the deletion of shorter
length codewords making it possible to find more longer length codewords (test 54 :
number of codewords in group greater than 1 ?), and some codewords (half the amount
for the GA ; the "best" one for the MVA) of the previous group are deleted (step 55),
in order to re-loop (step 56) the algorithm at the beginning of the step 21 (see Fig.2)
and find different VLEC structures (the number of deleted codewords depends on
which method is used for selecting the words : if the GA method is used and one
wants to obtain a linear code, it is necessary to delete half of the codewords, while
with the MVA method only one codeword, the best one, is deleted, i.e. the one that
makes it possible to find the most codewords in the next group).

However, the Heuristic method thus described often considers very unlikely code
structures or proceeds with such care (in order not to miss anything) that a great
complexity is observed in the implementation of said method, which moreover is
rather time consuming and can thus become prohibitive. It has therefore been
proposed, in a European patent application filed on October 23, 2002, with the filing
number 02292624.0 (PHFR020110), an improved construction method with which it
is possible to reduce the complexity by avoiding these drawbacks, said method of
building a variable length error code comprising, more precisely, the steps of :

(1) initializing the needed parameters : minimum and maximum length of
codewords L1 and Lmax respectively, free distance dfree between each codeword (said
distance dfree being for a VLEC code C the minimum Hamming distance in the set of
all arbitrary extended codes), required number of codewords S ;

(2) generating (step 11) a fixed length code C of length L1 and minimal
distance bmin, with bmin = min {bk ; k = 1, 2, ..., R}, bk = the distance associated to
the codeword length Lk of code C and defined as the minimum Hamming distance
between all codewords of C with length Lk, and R = the number of different
codeword lengths in C, said generating step 11 creating a set W of n-bit long words
distant of d ;

(3) listing and storing (step 21) in the set W all the possible L1-tuples at the
distance of dmin from the codewords of C (said distance dmin for a VLEC code C
being the minimum value of all the diverging distances between all possible couples
of different-length codewords of C), and, if said set W is not empty, doubling the
number of words in W by affixing at the end of all words one extra bit, said storing
step therefore replacing the set W by a new one having twice as many words as the
previous one and the length of each one of these words being L1 + 1 ;

(4) deleting (step 31) all the words of the set W that do not satisfy the cmin
distance with all codewords of C, said distance cmin being the minimum converging
distance of the code C ;

(5) in the case where no word is found or the maximum number of bits is
reached, reducing (step 41) the constraint of distance for finding more words ;

(6) checking that all words of the set W are distant of bmin, the found words
being then added to the code C (step 34) ;

(7) if (step 35) the required number of codewords has not been reached,
repeating the steps (1) to (6) (i.e. the steps 21 to 35) until the method finds either no
further possibility to continue or the required number of codewords ;

(8) if the number of codewords of C is greater than S, calculating (operation
A4), on the basis of the structure of the VLEC code, the average length AL obtained
by weighting each codeword length with the probability of the source, said AL
becoming the ALmin, if it is lower than ALmin, with ALmin = the minimum value of
AL, and the corresponding code structure being kept in memory ;

said building method being moreover such that at most one bit is added at the end of
each word of the set W.

Simulations show that, with the classical Heuristic method, almost none of
the obtained best codes has a hole (i.e. a length jump in its structure). It is
then considered, in the previously cited European patent application, that most good
codes do not have a jump of length and, therefore, that the set of examined VLEC
codes can be reduced accordingly (which reduces the simulation time and the
complexity of implementation of the method, without modifying much the AL).
Following this hypothesis, the method has been, according to said European patent
application, modified so as not to add more than one bit at the end of each word
of the set W. The corresponding implementation (improved Heuristic construction
method, also called "noHole optimization" method) is illustrated in Figs 10 and 11,
which show the two parts of a flowchart corresponding to said method (the elements
that are identical to the ones observed in Figs.2 to 4 being designated with the same
references). With respect to the flowchart of Figs.2 to 4, the parts of the classical
Heuristic technique that are useless for the implementation of the
improved method have been removed :

(a) if W is empty at the end of the step 31 (reply YES to the test 32 : |W| = 0
?), the next phase is now (see Fig.10) not the repetition of the steps (23, 24, 31, 32),
but, according to said "noHole" method, the establishment (in place of said
repetition) of a direct connection 91 towards the input of the circuit carrying out the
operation 55 (deletion of some codewords, or of the best one, before a repetition of
the steps 21 to 24 and 31 to 35), said operation 55 being then, as previously,
followed by the operations 21 and following.

(b) the fourth phase of the method is now reduced to one step, the operation
41, which is the test "Number of codewords in last group = 1 ?" . If the reply is NO,
a direct link is established with the input of the step 55 (connection 91), in view of
carrying out said operation 55, and then the operations 21 and following. If the reply
is YES, a connection 92 is established with the input of the set of operations 52 to
54.

The results thus obtained are presented in the table of Fig. 12 for the 26-symbol
English source when using the GAS method for selecting codewords. It can be seen,
when comparing with results presented in Fig. 13, that although the result is not
completely optimal for dfree = 3 (the code structure has a hole at length L = 11), the
AL rise is really acceptable when one considers that there is both strictly no
degradation for the other dfree values and a gain of time between 2.5 and 4. The same
remarks apply when comparing the present solution with the ones obtained
in Fig.7, where the MVA complexity effect is clear. Similarly, applying the noHole
optimisation with the GA method for selecting codewords leads to a time gain at the
only expense of a slight AL rise for dfree = 3. Finally, Fig.5 shows on the other hand
that the current solution offers better AL for an acceptable gain of time, the noHole
optimisation compensating almost entirely the complexity induced by the GAS.

However, with the method thus described in said European patent
application, there are cases where there are too many small length codewords in the
generated VLEC code. It has then been proposed, in another European patent
application filed on March 11, 2003, with the filing number 03290604.2
(PHFR030026), another improved building method according to which the group
deletion is not only performed with the last obtained codewords group, but more
generally with groups up to a given length value group, in order to make it possible to
go back directly, and therefore very quickly, to smaller lengths, i.e. to skip many
algorithm steps in cases where there are too many small length codewords. More
precisely, denoting by Ls (with s for : skip) the length to which the algorithm will
skip back in the codeword deletion stage, it has been proposed to skip parts of the
original Heuristic algorithm by carefully jumping to lower lengths when looking for
codewords to be deleted (however, when the considered codewords group length L
is smaller than a preset value Ls, it is obviously better to apply the previous method,
and the deletion is then done within the group of length L). The lengths comprised
between L1 and Ls are consequently called "free lengths", i.e. lengths with a degree of
freedom, as they are decremented one by one in the search process (when the number
of free lengths grows, the simulation time also increases, exponentially). This
method, called "Ls optimization adding", is depicted in the flowchart formed by the
association of Fig. 10 (unchanged part of the previous method, the so-called noHole
optimization method) and Fig. 14 (modified part of the noHole optimization
method).

Said Fig. 14 is adapted from Fig. 11 according to the following
indications. The last added group of the current code is deleted, but only if
the codeword length of this previous group is (reply NO to the test 61) lower than or
equal to Ls (the steps that follow the test 61 are then the same as previously : steps
53, 54, and 55 or 52 at the output of the test 54). If said codeword length is greater
than Ls (reply YES to the test 61), an additional step 62 is provided for going to the
group with Ls-bit long codewords and deleting all groups with more than Ls bits. At
the output of the step 62, the same steps 54, 55 as previously are provided. In
practice, simulation results show that good compression rates can be obtained for Ls
< L(max), where L(max) is the maximal authorized codeword length (it can be noted
that increasing the value of Ls results in an improvement of the AL value until a
constant floor, the best value, is reached, and this behaviour then suggests a
possible dynamic choice of Ls, starting with Ls = L1 and incrementing it until said
floor is reached). It seems however that, unfortunately, said Ls optimization method
may not be sufficient to decrease the computation time in all situations.
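
The group deletion governed by the test 61 and the step 62 can be sketched as follows. The representation of the current code as a dictionary mapping each codeword length to its group of codewords is an assumption made for illustration, as is the function name ; the subsequent deletion of individual codewords (steps 54, 55) is not covered.

```python
def delete_groups(groups, Ls):
    """Sketch of test 61 / step 62. `groups` maps a codeword length to the
    list of codewords of that length; the last added group is the longest."""
    if not groups:
        return {}
    last = max(groups)
    if last > Ls:
        # Reply YES to the test 61: skip back to the Ls-bit group by
        # deleting every group with more than Ls bits (step 62).
        return {L: words for L, words in groups.items() if L <= Ls}
    # Reply NO to the test 61: only the last added group is deleted.
    return {L: words for L, words in groups.items() if L != last}

groups = {3: ["000"], 4: ["0110"], 5: ["10101"], 6: ["111000"]}  # hypothetical
print(sorted(delete_groups(groups, 4)))  # [3, 4]
print(sorted(delete_groups(groups, 6)))  # [3, 4, 5]
```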




SUMMARY OF THE INVENTION 

It is therefore an object of the invention to propose an improved construction
method with which this drawback is avoided and better codes can be obtained, with
an acceptable computation time.

To this end, the invention relates to a method such as defined in the
introductory part of the description and which is moreover characterized in that,
considering that all distributions of number of codewords for the best VLEC codes
have a similar bell-shaped curve, an optimal length value Lm is defined up to
which the number of codewords increases with their length, whereas it
decreases after said value Lm, said definition making it possible to apply the so-called
"Ls optimization" method while avoiding the edges of the curve and working locally.

According to a possible improved implementation, the invention relates to a
similar method (i.e. also such as defined in the introductory part of the description),
but which is now preferably characterized in that the deletion is realized not only in
the last obtained group but also in the group of a given length value, in order to go
back very quickly to smaller lengths, and, considering that all distributions of
number of codewords for the best VLEC codes have a similar bell-shaped curve,
an optimal length value Lm is defined up to which the number of
codewords increases with their length, whereas it decreases after said value Lm, said
definition making it possible to apply the so-called "Ls optimization" method while
avoiding the edges of the curve and working locally.
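
One plausible reading of the length Lm is the position of the peak of the distribution of the number of codewords per length. The sketch below, which is an illustration under that assumption and not the procedure of the invention, locates this peak in a structure given as (number of codewords, length) pairs and checks that the counts rise up to it and fall afterwards ; the structure used is hypothetical.

```python
def peak_length(structure):
    """Length at which the number of codewords is largest (first one on ties)."""
    count, length = max(structure, key=lambda pair: pair[0])
    return length

def is_bell_shaped(structure):
    """True if the counts never decrease up to the peak and never increase after it."""
    counts = [s for s, _ in sorted(structure, key=lambda pair: pair[1])]
    peak = counts.index(max(counts))
    rising = all(a <= b for a, b in zip(counts[:peak], counts[1:peak + 1]))
    falling = all(a >= b for a, b in zip(counts[peak:], counts[peak + 1:]))
    return rising and falling

structure = [(1, 3), (2, 4), (5, 5), (3, 6), (1, 7)]  # hypothetical (s_k, L_k) pairs
print(peak_length(structure))     # 5
print(is_bell_shaped(structure))  # True
```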

It is also an object of the invention to propose a device for carrying out said 
construction method. 

To this end, the invention relates to a device for carrying out a variable length
error code building method according to either of the two solutions thus proposed.

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention will now be described, by way of example, with 
reference to the accompanying drawings in which : 

- Fig. 1 depicts a conventional communication channel ; 

- Figs. 2 to 4 are the three parts of a single flowchart illustrating the main steps 
of a conventional method used for building a VLEC code (and called Heuristic 
method) ; 
- Fig. 5 illustrates an algorithm (called greedy algorithm, or GA) used for the 
initialization of the method of Figs. 2 to 4, and Fig. 6 is a table giving various VLEC 
codes for a source constructed with the Heuristic construction using said algorithm 
of Fig. 5 ; 

- Fig.7 illustrates another algorithm (called majority voting algorithm, or 
MVA) used for the initialization of the method of Figs. 2 to 4, and Fig.8 is another 
table giving various VLEC codes for a source constructed with the Heuristic 
construction using said algorithm of Fig. 7 ; 

- Fig.9 is a table giving for the 26-symbol English source the correspondence 
between the source symbol and its probability ; 

- Figs. 10 and 11 are the two parts of a single flowchart illustrating an 
implementation of an improvement of the conventional method illustrated in Figs. 2 
to 4 ; 

- Fig. 12 is another table giving various VLEC codes for the same 26-symbol 
English source as considered in the tables of Figs. 6 and 8 and using the GAS ; 

- Fig. 13 is another table giving various VLEC codes for the same source as in 
Fig. 12 and using both the GAS previously mentioned and the building method 
according to the improvement illustrated in Figs. 10 and 11 ; 

- Fig. 14 shows the modification of the part of flowchart of Fig. 11 according to 
another improvement of the conventional method illustrated in Figs. 2 to 4 ; 

- Fig. 15 shows, with respect to the single flowchart formed by the association 
of Figs. 10 and 14, a modification of the lower part of flowchart of Fig. 10 when the 
method according to the invention is carried out (only this lower modified part, with 
respect to the original Fig. 10, is shown) ; 

- Fig. 16 is a table illustrating the results obtained for the 26-symbol English 
source when the previous method of Fig. 14 ("Ls optimization" method) is carried 
out ; 

- Fig. 17 is a table illustrating the results obtained for the 26-symbol English 
source when the method according to the invention is carried out. 

DETAILED DESCRIPTION OF THE INVENTION 

Considering the results of some simulations made on the basis of the classic 
Heuristic method or the modified ones ("noHole optimization" method, "Ls 
optimization" method), it appears that all distributions of number of codewords for 
the best VLEC codes found in said simulations have a similar aspect : the number 
Nc(L) of codewords at length L versus the codeword length L is a curve generally 
exhibiting a bell shape. It means that, up to a given length Lm (m standing for 
"middle"), the number of codewords increases with the length, whereas, after said 
length Lm, the number of codewords decreases. 
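
The observation above can be illustrated by a minimal sketch, which is not part of 
the original disclosure. It assumes the distribution Nc(L) is available as a mapping 
from codeword length to number of codewords; the function names find_lm and 
is_bell_shaped, the tie-breaking rule and the sample distribution are hypothetical 
and given for illustration only. 

```python
def find_lm(nc):
    """Return the length L at which the number of codewords Nc(L) peaks.

    `nc` maps a codeword length L to the number of codewords Nc(L).
    Ties are resolved in favour of the smallest length (an assumption).
    """
    return min(nc, key=lambda length: (-nc[length], length))


def is_bell_shaped(nc):
    """Check that Nc(L) does not decrease up to Lm and does not increase after."""
    lengths = sorted(nc)
    lm = find_lm(nc)
    for prev, cur in zip(lengths, lengths[1:]):
        # Rising side: every step ending at or before Lm must not go down.
        if cur <= lm and nc[cur] < nc[prev]:
            return False
        # Falling side: every step starting at or after Lm must not go up.
        if prev >= lm and nc[cur] > nc[prev]:
            return False
    return True


# Hypothetical distribution, for illustration only (not taken from the figures).
example = {4: 1, 5: 2, 6: 4, 7: 6, 8: 5, 9: 3}
```

With this sample distribution, find_lm(example) gives the length 7, which plays the 
role of Lm in the remainder of the description. 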

With respect to the single flowchart formed by the association of Fig. 10 and 
Fig. 14, the previous remark is therefore exploited by means of a modification of the 
lower part of flowchart of Fig. 10 : the method according to the invention is 
illustrated in Fig. 15, which must be considered together with the remaining upper 
part of Fig. 10 (as explained in the following paragraph describing how said lower 
part of Fig. 10 contributes to the implementation of the method according to the 
invention), and with said Fig. 14 in order to form the new single flowchart 
corresponding to the implementation of said method. This method, called "Lm 
optimization", is an adaptation of the "Ls optimization" method, and this adaptation 
is done by introducing the technical measures now described. 

According to said measures, a test circuit 71 ("Size of set W ?"), a test circuit 
72 ("Word length of W ?"), and a computing circuit 73 ("Put size of W = size of last 
group") are added to the solution shown in Fig. 10, between the circuits 33 and 34 : the 
method is indeed implemented after the circuit 33, where the set W is in accordance 
with all the different distances, and before the circuit 34, that performs the addition 
of the words of W to the code C. Between these two circuits 33 and 34, additional 
tests are carried out according to the invention. 

First, one assumes that the word length Lw in W is lower than Lm. To respect 
the shape of the curve, the size of the set W must not be smaller than the size of the 
last group. If said size is smaller than the size of the last group (reply YES to the 
test 71), the steps 21 to 24 and 31 to 35 must be repeated, as explained above when 
describing the previous implementations. If said size of W is not smaller than the 
size of the last group, or if the word length Lw is greater than Lm (reply NO to the 
test 71), the test 72 is performed. 

If the word length Lw is now greater than Lm and lower than L(max) - 2, the 
size of W must be lower than the size of the last group. If this is not true (reply YES 
to the test 72), the number of words in W is too high and must be limited to at most 
the size of the last group, which is done in the circuit 73. Then the words in W are 
added to C (circuit 34), as previously with the "Heuristic", "noHole optimization" or 
"Ls optimization" method. If it is true (size of W lower than the size of the last 
group : reply NO to the test 72), nothing is done before adding the words in W to the 
code C (circuit 34). To give some freedom to the method, which allows the shape of 
the code distribution to be found with little oscillation, the shape constraint is tested 
not globally but locally, i.e. the slope is verified for only two lengths : the length Lw 
of W and the length L(last_group) of the last group. 
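
The local test just described (tests 71 and 72, computing circuit 73) can be 
summarized by the following minimal sketch, given under stated assumptions and not 
as the implementation of the flowchart of Fig. 15. The function name 
lm_optimization_step, its parameters and the returned action labels are hypothetical; 
in particular, the description does not say which words are kept when W is reduced, 
so keeping the first ones is an assumption. 

```python
def lm_optimization_step(w, lw, lm, lmax, last_group_size):
    """Apply the local bell-shape constraint to a candidate set W.

    w               -- list of candidate words of length lw (set W after circuit 33)
    lw, lm, lmax    -- word length of W, optimal length Lm, maximum length
    last_group_size -- number of codewords in the last group added to the code C

    Returns (action, words): action is "retry" when the earlier steps must be
    repeated, or "add" when `words` can be passed on to circuit 34.
    """
    # Test 71: on the rising side (Lw < Lm), W must not be smaller than the
    # last group; otherwise the earlier steps are repeated.
    if lw < lm and len(w) < last_group_size:
        return "retry", []

    # Test 72: on the falling side (Lm < Lw < Lmax - 2), W must not be larger
    # than the last group.
    if lm < lw < lmax - 2 and len(w) > last_group_size:
        # Circuit 73: limit W to the size of the last group.
        # Keeping the first words is an assumption made for this sketch.
        return "add", list(w[:last_group_size])

    # Otherwise W is passed unchanged to circuit 34.
    return "add", list(w)
```

Only the length Lw of W and the size of the last group enter the decision, which 
reflects the local, two-length nature of the constraint. 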

By comparing the results obtained with the "Ls optimization" method (table of 
Fig. 16) and the ones obtained with the present "Lm optimization" method, illustrated 
in the table of Fig. 17, it appears that the "Lm optimization" gives the same best code 
with a small gain of time. This can be explained by the fact that sources with a small 
number of codewords are already dealt with quickly by the "Ls optimization" method, 
the "Lm optimization" one showing its interest especially when there is a higher 
number of codewords. 